
Robust Variational Bayes by Min-Max Median Aggregation

Yan, Jiawei, Liu, Ju, Liu, Weidong, Tu, Jiyuan

arXiv.org Machine Learning

We propose a robust and scalable variational Bayes (VB) framework designed to handle contamination and outliers in datasets. Our approach partitions the data into $m$ disjoint subsets and formulates a joint optimization problem based on robust aggregation principles. A key insight is that the full posterior distribution is the minimizer of the mean Kullback-Leibler (KL) divergence from the $m$-powered local posterior distributions. To enhance robustness, we replace the mean KL divergence with a min-max median formulation. The min-max formulation not only ensures consistency between the KL minimizer and the Evidence Lower Bound (ELBO) maximizer but also facilitates the establishment of improved statistical rates for the mean of the variational posterior. We observe a notable discrepancy in the $m$-powered marginal log-likelihood function depending on whether local latent variables are present. To guarantee the consistency of the aggregated variational posterior, we therefore treat these two scenarios separately; when local latent variables are present, we introduce an aggregate-and-rescale strategy. Theoretically, we provide a non-asymptotic analysis of the proposed posterior, incorporating a refined Bernstein-von Mises (BvM) analysis that accommodates a diverging number of subsets $m$. Our findings indicate that the two-stage approach yields a smaller approximation error than directly aggregating the $m$-powered local posteriors. Furthermore, we establish a nearly optimal statistical rate for the mean of the proposed posterior, advancing existing theory on min-max median estimators. The efficacy of our method is demonstrated through extensive simulation studies.
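The robustness principle behind the abstract, replacing a mean by a median across disjoint data subsets, can be illustrated with a toy point-estimation analogue. This is not the authors' VB procedure; the data, subset count, and contamination level below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean data from N(5, 1), with 5 gross outliers injected at 1000.
n = 1000
data = rng.normal(5.0, 1.0, n)
outlier_idx = rng.choice(n, size=5, replace=False)
data[outlier_idx] = 1000.0

m = 20  # number of disjoint subsets
subsets = np.array_split(rng.permutation(data), m)
local_means = np.array([s.mean() for s in subsets])

mean_agg = local_means.mean()        # non-robust: dragged toward the outliers
median_agg = np.median(local_means)  # robust: outliers hit at most 5 of 20 subsets

print(mean_agg, median_agg)
```

The median aggregate stays near the true mean of 5 because the few contaminated subsets land in the tails of the 20 local estimates, while averaging the local estimates reproduces the (badly biased) global mean.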


Appendix for When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting (code for EpiFNP and the wILI dataset is publicly available)

Neural Information Processing Systems

Deep learning is also suitable because it can ingest data from multiple sources, which better informs the model of what is happening on the ground. Our work aims to close this gap in the literature. Existing approaches for uncertainty quantification can be categorized into three lines. The second line tries to combine stochastic processes with DNNs. The third line is based on model ensembling [24], which trains multiple DNNs with different initializations and uses their predictions for uncertainty quantification.
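The third line (ensembling) can be sketched in miniature. The example below substitutes cheap random-feature regressors for DNNs, with the random feature draw standing in for a different initialization; all settings are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: y = sin(x) + noise, observed only on [-2, 2].
x_train = rng.uniform(-2, 2, 40)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=40)

def fit_member(seed, n_features=50):
    """One ensemble member: random-feature least-squares regression.
    The random draw of (W, b) plays the role of a different initialization."""
    r = np.random.default_rng(seed)
    W = r.normal(size=n_features)
    b = r.uniform(0, 2 * np.pi, n_features)
    phi = lambda x: np.cos(np.outer(x, W) + b)
    beta, *_ = np.linalg.lstsq(phi(x_train), y_train, rcond=None)
    return lambda x: phi(x) @ beta

ensemble = [fit_member(s) for s in range(10)]

x_test = np.array([0.0, 4.0])  # 0.0 is in-distribution, 4.0 is far outside
preds = np.stack([f(x_test) for f in ensemble])
mean, std = preds.mean(axis=0), preds.std(axis=0)
# Member disagreement (std) is small where data exists
# and grows far from the training support.
```

The ensemble standard deviation serves as the uncertainty estimate: members agree where they all saw data and diverge where they must extrapolate.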


Variational Inference for Latent Variable Models in High Dimensions

Zhong, Chenyang, Mukherjee, Sumit, Sen, Bodhisattva

arXiv.org Machine Learning

In modern applications, latent variable models typically involve a large number of parameters and latent variables, resulting in complex and high-dimensional posteriors that are computationally intractable. For such scenarios, traditional Markov chain Monte Carlo (MCMC) approaches often suffer from lengthy burn-in periods and generally lack scalability [11]. Recently, variational inference (VI) [31, 10, 52, 11] has emerged as a popular and scalable alternative method for approximating intractable posterior distributions in large-scale applications (where the number of observations and the dimensionality are both large) and is typically orders of magnitude faster than MCMC methods. Among the various forms of VI, arguably the most widely used and important is mean-field variational inference (MFVI) [52, 11], which approximates the intractable posterior by a product distribution. This approach has been widely adopted in statistics and machine learning, thanks to efficient algorithmic implementations based on coordinate ascent variational inference (CAVI) [10, 11, 19, 7, 5, 36, 14, 34].
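A minimal worked instance of MFVI with CAVI is the standard textbook example of approximating a correlated bivariate Gaussian by a product of two univariate Gaussians (this is an illustration of the general technique, not code from the paper):

```python
import numpy as np

# Target: bivariate Gaussian with mean mu and precision matrix Lam.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.9],
                [0.9, 1.5]])

# Mean-field family: q(x1, x2) = q1(x1) q2(x2), each factor Gaussian.
# CAVI coordinate updates (e.g. Bishop, PRML Sec. 10.1.2):
#   m_j <- mu_j - (1 / Lam_jj) * sum_{i != j} Lam_ji (m_i - mu_i)
m = np.zeros(2)  # initialize the variational means
for _ in range(50):
    m[0] = mu[0] - Lam[0, 1] * (m[1] - mu[1]) / Lam[0, 0]
    m[1] = mu[1] - Lam[1, 0] * (m[0] - mu[0]) / Lam[1, 1]

# The variational means converge to the true mean, while the factor
# variances 1/Lam_jj underestimate the true marginal variances,
# a well-known property of the mean-field approximation.
print(m)  # -> approximately [1.0, -1.0]
```

Each update holds one factor fixed and sets the other to the exact conditional optimum, so the ELBO increases monotonically and the iteration contracts to a fixed point.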


Reparameterized Variational Rejection Sampling

Jankowiak, Martin, Phan, Du

arXiv.org Machine Learning

Traditional approaches to variational inference rely on parametric families of variational distributions, with the choice of family playing a critical role in determining the accuracy of the resulting posterior approximation. Simple mean-field families often lead to poor approximations, while rich families of distributions like normalizing flows can be difficult to optimize and usually do not incorporate the known structure of the target distribution due to their black-box nature. To expand the space of flexible variational families, we revisit Variational Rejection Sampling (VRS) [Grover et al., 2018], which combines a parametric proposal distribution with rejection sampling to define a rich non-parametric family of distributions that explicitly utilizes the known target distribution. By introducing a low-variance reparameterized gradient estimator for the parameters of the proposal distribution, we make VRS an attractive inference strategy for models with continuous latent variables. We argue theoretically and demonstrate empirically that the resulting method--Reparameterized Variational Rejection Sampling (RVRS)--offers an attractive trade-off between computational cost and inference fidelity. In experiments we show that our method performs well in practice and that it is well-suited for black-box inference, especially for models with local latent variables.
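Plain rejection sampling with a parametric proposal, the building block that VRS extends, can be sketched as follows. The target, proposal, and bound below are illustrative assumptions, not the paper's RVRS estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    """Unnormalized log density: a two-component Gaussian mixture."""
    return np.logaddexp(-0.5 * (x - 2) ** 2, -0.5 * (x + 2) ** 2)

def rejection_sample(n, scale=3.0, log_M=2.5):
    """Draw n samples from the target with a N(0, scale^2) proposal.
    log_M must upper-bound log_target(x) - log_proposal(x) for all x
    (2.5 suffices for these particular choices)."""
    samples = []
    while len(samples) < n:
        x = rng.normal(0.0, scale)
        log_q = -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2 * np.pi))
        # Accept x with probability target(x) / (M * proposal(x)).
        if np.log(rng.uniform()) < log_target(x) - log_q - log_M:
            samples.append(x)
    return np.array(samples)
```

The accepted draws are exact samples from the normalized target; the price is the rejection rate, which is why VRS-style methods learn the proposal parameters rather than fixing them by hand.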


Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables

Wang, Qi, van Hoof, Herke

arXiv.org Machine Learning

Neural processes (NPs) constitute a family of variational approximate models for stochastic processes with promising properties in computational efficiency and uncertainty quantification. These processes use neural networks with latent variable inputs to induce predictive distributions. However, the expressiveness of vanilla NPs is limited because they use only a global latent variable, while target-specific local variation may sometimes be crucial. To address this challenge, we investigate NPs systematically and present a new variant of the NP model that we call the Doubly Stochastic Variational Neural Process (DSVNP). This model combines a global latent variable with local latent variables for prediction. We evaluate this model in several experiments, and our results demonstrate competitive prediction performance in multi-output regression and uncertainty estimation in classification.